Design alternatives for shared memory multiprocessors

Authors

  • John B. Carter
  • Chen-Chi Kuo
  • Ravindra Kuramkote
  • Mark R. Swanson
Abstract

In this paper, we consider the design alternatives available for building the next-generation DSM machine (e.g., the choice of memory architecture, network technology, and amount and location of per-node remote data cache). To investigate this design space, we have simulated five applications on a wide variety of possible DSM architectures that employ significantly different caching techniques. We also examine the impact of using a special-purpose system interconnect designed specifically to support low-latency DSM operation versus using a powerful off-the-shelf system interconnect. We found that two architectures have the best combination of good average performance and reasonable worst-case performance: CC-NUMA employing a moderate-sized DRAM remote access cache (RAC) and a hybrid CC-NUMA/S-COMA architecture called AS-COMA or adaptive S-COMA. Both pure CC-NUMA and pure S-COMA have serious performance problems for some applications, while CC-NUMA employing an SRAM RAC does not perform as well as the two architectures that employ larger DRAM caches. The paper concludes with several recommendations to designers of next-generation DSM machines, complete with a discussion of the issues that led to each recommendation so that designers can decide which ones are relevant to them given changes in technology and corporate priorities.

*Mark Swanson is now at Intel Corporation.

This research was supported in part by the Space and Naval Warfare Systems Command (SPAWAR) and the Advanced Research Projects Agency (ARPA), under SPAWAR contract No.#N0039-95-C-0018 and ARPA Order No.#B990. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, the Air Force Research Laboratory, or the US Government.

Similar resources

Execution-Driven Simulation of Shared-Memory Multiprocessors

This paper describes an efficient execution-driven technique for the simulation of shared-memory multiprocessors driven by real programs. Our simulator offers substantial advantages in terms of reduced time and space overheads when compared to instruction-driven or trace-driven simulation techniques, without significant loss of accuracy. The technique produces correctly interleaved address traces a...

Modeling and Performance Evaluation of Multi-Processors Organization with Shared Memories

This paper is primarily concerned with the theoretical evaluation of the performance of multiprocessor systems. A Markovian waiting-line model has been developed for various multiprocessor configurations with shared memory. The system is analysed at the request level rather than the job level.

Software Caching on Cache-Coherent Multiprocessors

Programmers have always been concerned with data distribution and remote memory access costs on shared-memory multiprocessors that lack coherent caches, like the BBN Butterfly. Recently memory latency has become an important issue on cache-coherent multiprocessors, where dramatic improvements in microprocessor performance have increased the relative cost of cache misses and coherency transaction...

Classifying Software-Based Cache Coherence Solutions

The authors propose a classification for software solutions to cache coherence in shared memory multiprocessors and show how it can be applied to more completely understand existing approaches and explore possible alternatives.

Evaluation of Design Alternatives for a Directory-Based Cache Coherence Protocol in Shared-Memory Multiprocessors

In shared-memory multiprocessors, caches are attached to the processors in order to reduce the memory access latency. To keep the memory consistent, a cache coherence protocol is needed. A well-known approach is to record in a directory which caches have copies of a memory block, and to notify only those caches when a processor modifies the block. Such a protocol is called a directory-b...

A Study on the Impact of Memory Consistency Models on Parallel Algorithms for Shared-Memory Multiprocessors

The memory consistency model is an integral part of a shared-memory multiprocessor system and directly affects performance. Most current multiprocessors adopt relaxed consistency models in quest of higher performance. In this paper we study the impact of the memory consistency model on the design, implementation and performance of parallel algorithms for graph problems that remain challenging du...

Publication date: 1998